31 research outputs found
Heterogeneous graph learning for explainable recommendation over academic networks
With the explosive growth in the number of new graduates with research degrees every year, early-career researchers face unprecedented challenges in finding a job at a suitable institution. This study aims to understand the behavior of academic job transition and hence recommend suitable institutions for PhD graduates. Specifically, we design a deep learning model to predict the career moves of early-career researchers and provide suggestions. The design is built on top of scholarly/academic networks, which contain abundant information about scientific collaboration among scholars and institutions. We construct a heterogeneous scholarly network to facilitate the exploration of career-move behavior and the recommendation of institutions for scholars. We devise an unsupervised learning model called HAI (Heterogeneous graph Attention InfoMax), which combines attention mechanisms and mutual information for institution recommendation. Moreover, we propose scholar attention and meta-path attention to discover the hidden relationships between several meta-paths. With these mechanisms, HAI provides ordered recommendations with explainability. We evaluate HAI on a real-world dataset against baseline methods. Experimental results verify the effectiveness and efficiency of our approach. © 2021 ACM
Matching algorithms : fundamentals, applications and challenges
Matching plays a vital role in the rational allocation of resources in many areas, ranging from market operation to people's daily lives. In economics, the term matching theory is coined for pairing two agents in a specific market to reach a stable or optimal state. In computer science, many branches of matching problems have emerged, such as question-answer matching in information retrieval, user-item matching in recommender systems, and entity-relation matching in knowledge graphs. A preference list is the core element of a matching process; it can either be obtained directly from the agents or generated indirectly by prediction. Based on how the preference list is obtained, matching problems are divided into two categories, i.e., explicit matching and implicit matching. In this paper, we first introduce the basic models and algorithms of matching theory in explicit matching. We then review the existing methods for coping with various matching problems in implicit matching, such as retrieval matching, user-item matching, entity-relation matching, and image matching. Furthermore, we look into representative applications in these areas, including marriage and labor markets in explicit matching and several similarity-based matching problems in implicit matching. Finally, this survey paper concludes with a discussion of open issues and promising future directions in the field of matching. © 2017 IEEE. Please note that this article has multiple authors; therefore only the names of the first 5, including Federation University Australia affiliates "Jing Ren, Xia Feng, Nargiz Sultanova", are provided in this record.
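As an illustration of explicit matching with directly stated preference lists, the following is a minimal Python sketch of the classic deferred-acceptance (Gale-Shapley) procedure for one-to-one stable matching. It is not taken from the survey itself; the function name `stable_matching`, the graduate/institution names, and the assumption of equal-sized sides with complete preference lists are all hypothetical choices made for the example.

```python
from collections import deque


def stable_matching(proposer_prefs, receiver_prefs):
    """Deferred acceptance for two equal-sized sides with complete preference lists.

    Both arguments map an agent to its ordered list of preferred partners
    (most preferred first). Returns a dict proposer -> receiver.
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # next receiver index to try
    engaged_to = {}                               # receiver -> current proposer
    free = deque(proposer_prefs)                  # proposers without a partner

    while free:
        p = free.popleft()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p                     # receiver was unmatched
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p                     # receiver trades up
            free.append(current)
        else:
            free.append(p)                        # rejected; p tries again later

    return {p: r for r, p in engaged_to.items()}


# Hypothetical toy market: two graduates, two institutions.
graduates = {"a": ["x", "y"], "b": ["x", "y"]}
institutions = {"x": ["b", "a"], "y": ["a", "b"]}
matching = stable_matching(graduates, institutions)
```

In this toy market both graduates prefer institution `x`, but `x` prefers `b`, so `a` ends up with `y`; no graduate-institution pair would both rather be matched to each other, which is what stability means here.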
MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation
We present MosaicFusion, a simple yet effective diffusion-based data
augmentation approach for large vocabulary instance segmentation. Our method is
training-free and does not rely on any label supervision. Two key designs
enable us to employ an off-the-shelf text-to-image diffusion model as a useful
dataset generator for object instances and mask annotations. First, we divide
an image canvas into several regions and perform a single round of diffusion
process to generate multiple instances simultaneously, conditioning on
different text prompts. Second, we obtain corresponding instance masks by
aggregating cross-attention maps associated with object prompts across layers
and diffusion time steps, followed by simple thresholding and edge-aware
refinement processing. Without bells and whistles, our MosaicFusion can produce
a significant amount of synthetic labeled data for both rare and novel
categories. Experimental results on the challenging LVIS long-tailed and
open-vocabulary benchmarks demonstrate that MosaicFusion can significantly
improve the performance of existing instance segmentation models, especially
for rare and novel categories. Code will be released at
https://github.com/Jiahao000/MosaicFusion.
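The mask-extraction step described in this abstract (aggregating cross-attention maps across layers and diffusion time steps, then thresholding) can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes the maps for one object prompt have already been resized to a common resolution, uses a plain mean as the aggregation and min-max normalization before thresholding, and omits the edge-aware refinement entirely. The function name `mask_from_attention` and the default threshold of 0.5 are hypothetical.

```python
def mask_from_attention(attn_maps, threshold=0.5):
    """Turn a stack of cross-attention maps into a binary instance mask.

    attn_maps: list of H x W maps (nested lists of floats), one per
    (layer, time step) pair, all assumed to share the same resolution.
    """
    h, w = len(attn_maps[0]), len(attn_maps[0][0])
    n = len(attn_maps)

    # Aggregate: average the maps over all layers and time steps.
    mean = [
        [sum(m[i][j] for m in attn_maps) / n for j in range(w)]
        for i in range(h)
    ]

    # Min-max normalize to [0, 1] so one threshold works across prompts.
    lo = min(min(row) for row in mean)
    hi = max(max(row) for row in mean)
    scale = (hi - lo) or 1.0  # avoid division by zero on a constant map
    norm = [[(v - lo) / scale for v in row] for row in mean]

    # Simple thresholding: 1 marks pixels assigned to the object.
    return [[1 if v >= threshold else 0 for v in row] for row in norm]


# Two hypothetical 2 x 2 attention maps for the same object prompt.
maps = [
    [[0.0, 0.25], [0.75, 1.0]],
    [[0.0, 0.5], [0.5, 1.0]],
]
mask = mask_from_attention(maps)
```

Here the averaged map is `[[0.0, 0.375], [0.625, 1.0]]`, so only the bottom row passes the 0.5 threshold.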
Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation
In this work, we focus on open vocabulary instance segmentation to expand a
segmentation model to classify and segment instance-level novel categories.
Previous approaches have relied on massive caption datasets and complex
pipelines to establish one-to-one mappings between image regions and words in
captions. However, such methods build noisy supervision by matching non-visible
words to image regions, such as adjectives and verbs. Meanwhile, context words
are also important for inferring the existence of novel objects as they show
high inter-correlations with novel categories. To overcome these limitations,
we devise a joint Caption Grounding and Generation (CGG) framework,
which incorporates a novel grounding loss that only focuses on matching object
nouns to improve learning efficiency. We also introduce a caption generation
head that enables additional supervision and contextual modeling as a
complementation to the grounding loss. Our analysis and results demonstrate
that grounding and generation components complement each other, significantly
enhancing the segmentation performance for novel classes. Experiments on the
COCO dataset with two settings: Open Vocabulary Instance Segmentation (OVIS)
and Open Set Panoptic Segmentation (OSPS) demonstrate the superiority of the
CGG. Specifically, CGG achieves a substantial improvement of 6.8% mAP for novel
classes without extra data on the OVIS task and 15% PQ improvements for novel
classes on the OSPS benchmark. Comment: ICCV-202
OV-VG: A Benchmark for Open-Vocabulary Visual Grounding
Open-vocabulary learning has emerged as a cutting-edge research area,
particularly in light of the widespread adoption of vision-based foundational
models. Its primary objective is to comprehend novel concepts that are not
encompassed within a predefined vocabulary. One key facet of this endeavor is
Visual Grounding, which entails locating a specific region within an image
based on a corresponding language description. While current foundational
models excel at various visual language tasks, there's a noticeable absence of
models specifically tailored for open-vocabulary visual grounding. This
research endeavor introduces novel and challenging OV tasks, namely
Open-Vocabulary Visual Grounding and Open-Vocabulary Phrase Localization. The
overarching aim is to establish connections between language descriptions and
the localization of novel objects. To facilitate this, we have curated a
comprehensive annotated benchmark, encompassing 7,272 OV-VG images and 1,000
OV-PL images. In our pursuit of addressing these challenges, we delved into
various baseline methodologies rooted in existing open-vocabulary object
detection, VG, and phrase localization frameworks. Surprisingly, we discovered
that state-of-the-art methods often falter in diverse scenarios. Consequently,
we developed a novel framework that integrates two critical components:
Text-Image Query Selection and Language-Guided Feature Attention. These modules
are designed to bolster the recognition of novel categories and enhance the
alignment between visual and linguistic information. Extensive experiments
demonstrate the efficacy of our proposed framework, which consistently attains
SOTA performance across the OV-VG task. Additionally, ablation studies provide
further evidence of the effectiveness of our innovative models. Codes and
datasets will be made publicly available at https://github.com/cv516Buaa/OV-VG
Transformer-Based Visual Segmentation: A Survey
Visual segmentation seeks to partition images, video frames, or point clouds
into multiple segments or groups. This technique has numerous real-world
applications, such as autonomous driving, image editing, robot sensing, and
medical analysis. Over the past decade, deep learning-based methods have made
remarkable strides in this area. Recently, transformers, a type of neural
network based on self-attention originally designed for natural language
processing, have considerably surpassed previous convolutional or recurrent
approaches in various vision processing tasks. Specifically, vision
transformers offer robust, unified, and even simpler solutions for various
segmentation tasks. This survey provides a thorough overview of
transformer-based visual segmentation, summarizing recent advancements. We
first review the background, encompassing problem definitions, datasets, and
prior convolutional methods. Next, we summarize a meta-architecture that
unifies all recent transformer-based approaches. Based on this
meta-architecture, we examine various method designs, including modifications
to the meta-architecture and associated applications. We also present several
closely related settings, including 3D point cloud segmentation, foundation
model tuning, domain-aware segmentation, efficient segmentation, and medical
segmentation. Additionally, we compile and re-evaluate the reviewed methods on
several well-established datasets. Finally, we identify open challenges in this
field and propose directions for future research. The project page can be found
at https://github.com/lxtGH/Awesome-Segmenation-With-Transformer. We will also
continually monitor developments in this rapidly evolving field. Comment: Work in progress.
Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation
Open Vocabulary Instance Segmentation
Recent applications and chiral separation development based on stationary phases in open tubular capillary electrochromatography (2019–2022)
Capillary electrochromatography (CEC) plays a significant role in chiral separation via a double separation principle: differences in partition coefficients between the two phases, and electroosmotic flow-driven separation. Because the stationary phases (SPs) on the inner capillary wall have distinct properties, their separation abilities differ from one another. In particular, this leaves considerable room for promising applications of open tubular capillary electrochromatography (OT-CEC). We divide the OT-CEC SPs developed over the past four years into six types (ionic liquids, nanoparticle materials, microporous materials, biomaterials, non-nanopolymers, and others) and mainly introduce their characteristics in chiral drug separation. We also add a few classic SPs reported within the past ten years as supplements to enrich the features of each SP type. Additionally, we discuss applications to analytes other than chiral drugs, in metabolomics, food, cosmetics, the environment, and biology. OT-CEC plays an increasingly significant role in chiral separation and may promote the development of capillary electrophoresis (CE) combined with other instruments in recent years, such as CE with mass spectrometry (CE/MS) and CE with an ultraviolet light detector (CE/UV).